A Hybrid MPI–OpenMP Parallel Algorithm and Performance Analysis for an Ensemble Square Root Filter Designed for Multiscale Observations
Abstract
A hybrid parallel scheme for the ensemble square root filter (EnSRF), suitable for parallel assimilation of multiscale observations, including those from dense observational networks such as radar, is developed based on the domain decomposition strategy. The scheme handles internode communication through the Message Passing Interface (MPI) and communication within shared-memory nodes via Open Multiprocessing (OpenMP) threads. It also supports pure MPI and pure OpenMP modes. The parallel framework can accommodate high-volume remotely sensed radar (or satellite) observations as well as conventional observations, which usually have larger covariance localization radii. The performance of the parallel algorithm has been tested with simulated and real radar data. The parallel program shows good scalability in pure MPI and hybrid MPI–OpenMP modes, while pure OpenMP runs exhibit limited scalability on a symmetric shared-memory system. In MPI mode, better parallel performance is achieved with domain decomposition configurations in which the leading dimension of the state variable arrays is larger, because this configuration allows for more efficient memory access. Given a fixed amount of computing resources, the hybrid parallel mode is preferred to pure MPI mode on supercomputers with nodes containing shared-memory cores. The overall performance is also affected by factors such as cache size, memory bandwidth, and network topology. Tests with a real data case involving a large number of radars confirm that the parallel data assimilation can be done on a multicore supercomputer with a significant speedup compared to the serial data assimilation algorithm.
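The remark about the leading dimension can be illustrated with a small sketch. It is not the authors' code. It assumes column-major (Fortran-style) storage, where the first array index is the contiguous one, and a grid that divides evenly among patches. The function names and the 400 x 400 grid size are hypothetical. The sketch lists the ways a fixed number of MPI processes can tile a horizontal grid and ranks them by the length of the local leading dimension.

```python
def patch_shape(nx, ny, px, py):
    """Local patch size when an nx-by-ny grid is split into px-by-py patches.

    Assumption: the grid divides evenly; remainder handling is omitted.
    """
    if nx % px or ny % py:
        raise ValueError("grid must divide evenly among patches in this sketch")
    return nx // px, ny // py


def decompositions(nprocs):
    """All (px, py) factor pairs with px * py == nprocs."""
    return [(px, nprocs // px) for px in range(1, nprocs + 1) if nprocs % px == 0]


def rank_by_leading_dimension(nx, ny, nprocs):
    """Sort feasible decompositions, longest local leading dimension first.

    Assumption: the first index (x) is the contiguous one in memory, so a
    longer local x extent means longer contiguous runs per inner loop.
    """
    options = []
    for px, py in decompositions(nprocs):
        if nx % px == 0 and ny % py == 0:
            local_nx, local_ny = patch_shape(nx, ny, px, py)
            options.append(((px, py), local_nx, local_ny))
    return sorted(options, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical 400 x 400 grid distributed over 4 processes.
    for (px, py), local_nx, local_ny in rank_by_leading_dimension(400, 400, 4):
        print(f"{px} x {py} patches -> local array {local_nx} x {local_ny}")
```

With these hypothetical numbers, the 1 x 4 layout keeps the full 400-point leading dimension in each patch, while the 4 x 1 layout cuts it to 100 points. Under the column-major assumption, the first layout is the kind of configuration the abstract reports as performing better in MPI mode.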
Similar resources
A Hybrid MPI/OpenMP Parallel Algorithm and Performance Analysis for an Ensemble Square Root Filter Designed for Multi-scale Observations
A hybrid parallel scheme for the ensemble square root filter (EnSRF) suitable for parallel assimilation of multi-scale observations including those from dense observational networks such as those of radar is developed based on the domain decomposition strategy. The scheme handles inter-node communication through message passing interface (MPI), and the communication within shared...
An OpenMP+/MPI Recursive Least Squares pipelined Parallel Algorithm based on the square root and extended version Information Filter for heterogeneous parallel systems
This article describes a parallel OpenMP and/or MPI implementation of an algorithm for solving the Recursive Least Squares (RLS) problem, suitable for heterogeneous parallel systems. A model for automatic/adaptive load balancing is proposed for either homogeneous or heterogeneous parallel systems. The algorithm is based on a variant of the Kalman Filter named the square root and extended versio...
Parallel computing using MPI and OpenMP on self-configured platform, UMZHPC.
Parallel computing is a topic of interest for a broad scientific community since it facilitates many time-consuming algorithms in different application domains. In this paper, we introduce a novel platform for parallel computing by using MPI and OpenMP programming languages based on a set of networked PCs. UMZHPC is a free Linux-based parallel computing infrastructure that has been developed to cr...
Performance Characteristics of Hybrid MPI/OpenMP Implementations of NAS Parallel Benchmarks SP and BT on Large-Scale Multicore Clusters
The NAS Parallel Benchmarks (NPB) are well-known applications with the fixed algorithms for evaluating parallel systems and tools. Multicore clusters provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used with the data sharing with the multicores that comprise a node and MPI can be used with the communication between nodes. In this paper, we use SP and BT benchma...
Hybrid Programming and Performance for Beam Propagation Modeling
We examined hybrid parallel infrastructures in order to ensure performance and scalability for beam propagation modeling as we move toward extreme-scale systems. Using an MPI programming interface for parallel algorithms, we expanded the capability of our existing electromagnetic solver to a hybrid (MPI/shared-memory) model that can potentially use the computer resources on future-generation co...